Grid Computing Workloads: Bags of Tasks, Workflows, Pilots, and Others

Authors

  • Alexandru Iosup
  • Dick Epema
Abstract

In the mid-1990s, the grid computing community promised the "compute power grid," a utility computing infrastructure for scientists and engineers. Since then, a variety of grids have been built worldwide: for academic purposes, for specific application domains, and for general production work. Understanding the workloads of grids is important for the design and tuning of future grid resource managers and applications, especially in the recent wake of commercial grids and clouds. This article presents an overview of the most important characteristics of grid workloads in the past seven years (2003-2010). Starting from the data collected by the authors in the Grid Workloads Archive, this study focuses on four main axes of characterization: system usage, user population, general application characteristics, and characteristics of grid-specific application types. The utilizations of grids vary widely, but are stable in the long term. Although grid user populations range from tens to hundreds of individuals, a few users dominate each grid’s workload both in terms of consumed resources and of number of jobs submitted to the system. Real grid workloads include very few parallel jobs but many independent single-machine jobs (tasks) grouped into single "bags of tasks."

Similar resources

Trace-based Performance Analysis of Scheduling Bags of Tasks in Grids

Grid computing promises large scale computing facilities based on distributed systems. Much research has been done on the subject of increasing the performance of grids. We believe that an adequate performance analysis of grids requires knowledge of the workload and the architecture of the grid. Currently, researchers assume that grids are similar to other distributed systems, such as massively...

Online Scheduling of Workflow Applications in Grid Environment

Scheduling workflow applications in grid environments is a great challenge, because it is an NP-complete problem. Many heuristic methods have been presented in the literature and most of them deal with a single workflow application at a time. In recent years, several heuristic methods have been proposed to deal with concurrent workflows or online workflows, but they do not work with workflows c...

Online scheduling of workflow applications in grid environments

Scheduling workflow applications in grid environments is a great challenge, because it is an NP-complete problem. Many heuristic methods have been presented in the literature and most of them deal with a single workflow application at a time. In recent years, several heuristic methods have been proposed to deal with concurrent workflows or online workflows, but they do not work with workflows co...

Job Management and Task Bundling

High Performance Computing is often performed on scarce and shared computing resources. To ensure computers are used to their full capacity, administrators often incentivize large workloads that are not possible on smaller systems. Measurements in Lattice QCD frequently do not scale to machine-size workloads. By bundling tasks together we can create large jobs suitable for gigantic partitions. ...

GEMTC: GPU Enabled Many-Task Computing

Current software and hardware limitations prevent Many-Task Computing (MTC) workloads from leveraging hardware accelerators (NVIDIA GPUs, Intel Xeon Phi) boasting Many-Core Computing architectures. Some broad application classes that fit the MTC paradigm are workflows, MapReduce, high-throughput computing, and a subset of high-performance computing. MTC emphasizes using many computing resources...

Publication date: 2010